{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d15d8294-3328-4e07-ad16-8a03e9bbfdb9",
   "metadata": {},
   "source": [
    "# Don't get sideswiped by a low num_ctx value!\n",
    "\n",
    "Try summarizing a long web page such as https://cnn.com with with llama3.2  and keep an eye on the `ollama serve` output. You may see a \"truncating input prompt\" warning that looks like this:\n",
    "\n",
    "```\n",
    "time=2025-01-05T17:16:18.172-08:00 level=WARN source=runner.go:129 msg=\"truncating input prompt\" limit=2048 prompt=3940 keep=5 new=2048\n",
    "```\n",
    "\n",
    "According to https://github.com/ollama/ollama/blob/main/docs/modelfile.md, Ollama defaults the context window size to 2048. When the above warning is generated, the results returned by the model are pretty poor. It looks like the scraped webpage is being truncated before it is fed to the model.\n",
    "\n",
    "Unfortunately, it is pretty easy to miss this warning in the `ollama serve` output. On the upside, bumping up num_ctx is easy to do – just specify the num_ctx option.\n",
    "\n",
    "If you are using `ollama.chat()`:\n",
    "\n",
    "```\n",
    "response = ollama.chat(model=MODEL, messages=messages, options={\"num_ctx\": 4096})\n",
    "```\n",
    "\n",
    "See the following thread in the course Q&A: https://indeed.udemy.com/course/llm-engineering-master-ai-and-large-language-models/learn/lecture/46869605#questions/22842263"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c5868fa7-18b7-4a9e-a271-7cc28f79fb01",
   "metadata": {},
   "outputs": [],
   "source": [
    "# imports\n",
    "\n",
    "import requests\n",
    "from bs4 import BeautifulSoup\n",
    "from IPython.display import Markdown, display\n",
    "import ollama\n",
    "from openai import OpenAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49edee91-2c37-432a-8afb-95d90537df74",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Constants\n",
    "\n",
    "OLLAMA_API = \"http://localhost:11434/api/chat\"\n",
    "HEADERS = {\"Content-Type\": \"application/json\"}\n",
    "MODEL = \"llama3.2\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a27b061f-640a-454b-b700-1d1219fa33fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Let's just make sure the model is loaded\n",
    "\n",
    "!ollama pull llama3.2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1622d9bb-5c68-4d4e-9ca4-b492c751f898",
   "metadata": {},
   "source": [
    "# The exercise using ollama.chat()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "402d5686-4e76-4110-b65a-b3906c35c0a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# A class to represent a web page\n",
    "\n",
    "# Some websites need you to use proper headers when fetching them:\n",
    "headers = {\n",
    " \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
    "}\n",
    "\n",
    "class Website:\n",
    "\n",
    "    def __init__(self, url):\n",
    "        \"\"\"\n",
    "        Create this Website object from the given url using the BeautifulSoup library\n",
    "        \"\"\"\n",
    "        self.url = url\n",
    "        response = requests.get(url, headers=headers)\n",
    "        soup = BeautifulSoup(response.content, 'html.parser')\n",
    "        self.title = soup.title.string if soup.title else \"No title found\"\n",
    "        for irrelevant in soup.body([\"script\", \"style\", \"img\", \"input\"]):\n",
    "            irrelevant.decompose()\n",
    "        self.text = soup.body.get_text(separator=\"\\n\", strip=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e3249b38-c686-4c1e-9c77-78037c03ae54",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The messages (system and user prompts) to send to Ollama\n",
    "\n",
    "system_prompt = \"You are an assistant that analyzes the contents of a website \\\n",
    "and provides a short summary, ignoring text that might be navigation related. \\\n",
    "Respond in markdown.\"\n",
    "\n",
    "def user_prompt_for(website):\n",
    "    user_prompt = f\"You are looking at a website titled {website.title}\"\n",
    "    user_prompt += \"\\nThe contents of this website is as follows; \\\n",
    "please provide a short summary of this website in markdown. \\\n",
    "If it includes news or announcements, then summarize these too.\\n\\n\"\n",
    "    user_prompt += website.text\n",
    "    return user_prompt\n",
    "\n",
    "def messages_for(website):\n",
    "    return [\n",
    "        {\"role\": \"system\", \"content\": system_prompt},\n",
    "        {\"role\": \"user\", \"content\": user_prompt_for(website)}\n",
    "    ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5f45cfd-7089-4e61-a13b-14469f8161ba",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The code to call Ollama and summarize a website\n",
    "# Remove options={\"num_ctx\": 4096} to see the \"truncating input prompt\" warning\n",
    "\n",
    "def summarize(url):\n",
    "    website = Website(url)\n",
    "    messages = messages_for(website)\n",
    "    response = ollama.chat(model=MODEL, messages=messages, options={\"num_ctx\": 4096})\n",
    "    return response['message']['content']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74efa05b-47c0-449a-974e-0d9c042a83e2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The code to display a website summary in Markdown\n",
    "\n",
    "def display_summary(summary):\n",
    "    display(Markdown(summary))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a86418c-e07e-44e0-8c96-1109552ea5a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now summarize an actual website\n",
    "\n",
    "url = \"https://cnn.com\"\n",
    "summary = summarize(url)\n",
    "display_summary(summary)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d2966a6-8c5d-42f1-ad4a-0571224d8b38",
   "metadata": {},
   "source": [
    "# The exercise using ollama_via_openai.chat.completions.create()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7aa6f771-97b2-44d9-9c4c-4f0d8f2b716a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The OpenAI API is subject to the same default 2048 num_ctx issue. Unfortunately, upping num_ctx\n",
    "# isn't as straightforward with this API. See https://github.com/ollama/ollama/blob/main/docs/openai.md\n",
    "# for a way to create a new model with its num_ctx bumped up. Once you have the new model you can specify\n",
    "# it in the chat.completions.create() arguments.\n",
    "\n",
    "ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')\n",
    "\n",
    "response = ollama_via_openai.chat.completions.create(\n",
    "    model=MODEL,       # Where you can specify your new model once you create it\n",
    "    messages=messages_for(Website(\"https://cnn.com\")),\n",
    "    # max_tokens=4096  # This doesn't work\n",
    ")\n",
    "\n",
    "display_summary(response.choices[0].message.content)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
